A scored AUC Metric for Classifier Evaluation and Selection

Authors

  • Shaomin Wu
  • Peter Flach
Abstract

The area under the ROC (Receiver Operating Characteristic) curve, or simply AUC, has been widely used to measure model performance for binary classification tasks. It can be estimated under parametric, semiparametric, and nonparametric assumptions. The nonparametric estimate of the AUC, which is calculated from the ranks of the predicted scores of instances, does not always take sufficient advantage of the predicted scores. This problem is tackled in this paper. On the basis of the ranks and the original values of the predicted scores, we introduce a new metric, called a scored AUC or sAUC. Experimental results on 20 UCI data sets empirically demonstrate the validity of the new metric for classifier evaluation and selection.
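The nonparametric AUC estimate mentioned above is the Mann-Whitney statistic: the fraction of positive-negative pairs that the classifier ranks correctly, with ties counted as half. The Python sketch below computes it, alongside an illustrative score-weighted variant in the spirit of sAUC; the particular weighting shown (and the assumption that scores lie in [0, 1]) is ours for illustration, not necessarily the paper's exact definition.

import numpy as np

def auc_mann_whitney(scores_pos, scores_neg):
    # Nonparametric AUC: fraction of (positive, negative) pairs ranked
    # correctly by the scores, counting ties as 1/2 (Mann-Whitney statistic).
    diff = np.subtract.outer(np.asarray(scores_pos, float),
                             np.asarray(scores_neg, float))
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size

def scored_auc_sketch(scores_pos, scores_neg):
    # Illustrative score-weighted variant: weight each correctly ranked
    # pair by its score difference, so the metric also reflects how far
    # apart the scores are (assumes scores in [0, 1]).
    diff = np.subtract.outer(np.asarray(scores_pos, float),
                             np.asarray(scores_neg, float))
    return (diff * (diff > 0)).sum() / diff.size

# Toy example: the ranking is perfect, so AUC = 1.0 regardless of margins,
# while the weighted variant distinguishes wide margins from narrow ones.
pos, neg = [0.9, 0.8], [0.7, 0.1]
print(auc_mann_whitney(pos, neg))   # 1.0
print(scored_auc_sketch(pos, neg))  # 0.45 = (0.2 + 0.8 + 0.1 + 0.7) / 4

On this toy example the rank-based AUC saturates at 1.0, which is exactly the situation the abstract describes: the ranks alone do not take full advantage of the score values.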


Similar Articles

New Criteria for Rule Selection in Fuzzy Learning Classifier Systems

Designing an effective criterion for selecting the best rule is a major problem in the process of implementing Fuzzy Learning Classifier (FLC) systems. Conventionally, confidence and support, or combined measures of these, are used as criteria for fuzzy rule evaluation. In this paper, new measures, namely precision and recall from the field of Information Retrieval (IR) systems, are adapted as alternative...


An Improved Model Selection Heuristic for AUC

The area under the ROC curve (AUC) has been widely used to measure ranking performance for binary classification tasks. AUC only employs the classifier’s scores to rank the test instances; thus, it ignores other valuable information conveyed by the scores, such as the magnitude of the differences between score values. However, as such differences are inevitable across samples, ignoring them may ...


Evaluation of Classifiers in Software Fault-Proneness Prediction

The reliability of software depends on its fault-prone modules: the fewer fault-prone units a piece of software contains, the more we may trust it. Therefore, if we are able to predict the number of fault-prone modules in a software system, it becomes possible to judge its reliability. In predicting fault-prone modules, one of the contributing features is the software metric, by which one ...


An Empirical Evaluation of Ranking Measures With Respect to Robustness to Noise

Ranking measures play an important role in model evaluation and selection. Using both synthetic and real-world data sets, we investigate how different types and levels of noise affect the area under the ROC curve (AUC), the area under the ROC convex hull, the scored AUC, the Kolmogorov-Smirnov statistic, and the H-measure. In our experiments, the AUC was, overall, the most robust among these me...


Choosing the Best Classification Performance Metric for Wrapper-based Software Metric Selection for Defect Prediction

Software metrics and fault data are collected during the software development cycle. A typical software defect prediction model is trained using this collected data. Therefore, the quality and characteristics of the underlying software metrics play an important role in the efficacy of the prediction model. However, superfluous software metrics often exist. Identifying a small subset of metrics b...



Journal:

Volume   Issue

Pages  -

Publication date: 2005